Global register allocation plays a major role in determining the efficacy
compiler. Graph coloring has been used as the central paradigm for regis 
modern compilers. A straightforward coloring approach can suffer from 
shortcomings. These shortcomings are addressed in this paper by colorin 
priority ordering. A natural method for dealing with the spilling emerges 
The detailed algorithms for a priority-based colori ... 

Since the early days of logic programming, researchers in the field realiz
exploitation of parallelism present in the execution of logic programs. Tl 
nature, the presence of nondeterminism, and their referential transparenc 
characteristics, make logic programs interesting candidates for obtaining 
parallel execution. At the same time, the fact that the typical applications 
programming frequently involve irregular computatio ... 

In the last three decades a large number of compiler transformations for
programs have been implemented. Most optimizations for uniprocessors 
of instructions executed by the program using transformations based on i 
scalar quantities and data-flow techniques. In contrast, optimizations for 
superscalar, vector, and parallel processors maximize parallelism and mt 
transformations that rely on tracking the properties o ... 

Keywords: compilation, dependence analysis, locality, multiprocessors, 
parallelism, superscalar processors, vectorization 



4 An abstract machine for tabled execution of fixed-order stratified logic pro 
Konstantinos Sagonas, Terrance Swift

May 1998 ACM Transactions on Programming Languages and System 
Volume 20 Issue 3 

Publisher: ACM Press 




SLG resolution uses tabling to evaluate nonfloundering normal logic pr
the well-founded semantics. The SLG-WAM, which forms the engine of
can compute in-memory recursive queries an order of magnitude faster tl
deductive databases. At the same time, the SLG-WAM tightly integrate
tabled SLG code, and executes Prolog code with minimal overhead com
As a result, the SLG-WAM brings to logic progr ...

Speculative execution is an important source of parallelism for VLIW ar
processors. A serious challenge with compiler-controlled speculative ex
efficiently handle exceptions for speculative instructions. In this article,
features and compile-time scheduling support collectively referred to as 
is introduced. Sentinel scheduling provides an effective framework for b 
controlled speculative executi ... 

We have built a system in which the compiler back end and the linker w
present an abstract machine at a considerably higher level than the actual 
intermediate language translated by the back end is the target language o 
compilers and is also the only assembly language generally available. Tr 
intermodule register allocation, which would be harder if some of the co- 
had come from a traditional assembler, out of sight of... 

This paper describes the precise exception model of the MC88110 symn
microprocessor. The MC88110 is a superscalar, pipelined processor that
execution units and allows out-of-order execution of instructions. The MC
fully precise exceptions and presents the architecturally correct state to a
handling routine in a manner that minimizes exception response latency,
latency timings in the MC88110 are described, and several ...

Modern processors employ a large amount of hardware to dynamically d
single-threaded programs and maintain the sequential semantics implied
The complexity of some of this hardware diminishes the gains due to pai
longer clock period or increased pipeline latency of the machine. In this i
processor implementation which dynamically schedules groups of instru 
executing them on a fast simple engine and caches them f ... 

In recent years there has been an increasing trend toward the incorporati
into a variety of devices where the amount of memory available is limite 
desirable to try to reduce the size of applications where possible. This an 
use of compiler techniques to accomplish code compaction to yield smal 
main contribution of this article is to show that careful, aggressive, interj 
optimization, together with procedural abstr ... 

Keywords: code compaction, code compression, code size reduction 



13 VLIW compilation techniques in a superscalar environment 
Kemal Ebcioglu, Randy D. Groves, Ki-Chang Kim, Gabriel M. Silberman,
June 1994 ACM SIGPLAN Notices , Proceedings of the ACM SIGPLA
on Programming language design and implementation PLDI
Issue 6 
Publisher: ACM Press 


We describe techniques for converting the intermediate code representat 
program, as generated by a modern compiler, to another representation v 
same run-time results, but can run faster on a superscalar machine. The a 
novel parallelization techniques for Very Long Instruction Word (VLIW 
and place together independently executable operations that may be far a 
code, i.e., they may be se ... 

Compiler optimization plays a key role in unlocking the performance of
innovative dynamically-scheduled machine which is the first implement. 
PA 2.0 member of the HP PA-RISC architecture family. This wide supei 
order machine provides significant execution bandwidth and automatical 
runtime; however, despite its ample hardware resources, many of the opl 
transformations which proved effective for the PA-8000 served to au ... 

The ability to execute a program in reverse is advantageous for shortenir
paper presents a reverse execution methodology at the assembly instruct
memory and time overheads. The core idea of this approach is to general 
able to undo, in almost all cases, normal forward execution of an assemb 
program being debugged. The methodology has been implemented on a 

Utilizing parallelism at the instruction level is an important way to imprc
Because the time spent in loop execution dominates total execution time 
optimizations focuses on decreasing the time to execute each iteration. S 
is a technique that reforms the loop so that a faster execution rate is reali 
executed in overlapped fashion to increase parallelism. Let {ABQn 

Keywords: instruction level parallelism, loop reconstruction, optimizatic

Serialization of threads due to critical sections is a fundamental bottlene
performance in multithreaded programs. Dynamically, such serialization
unnecessary because these critical sections could have safely executed c
locks. Current processors cannot fully exploit such parallelism because f 
mechanisms to dynamically detect such false inter-thread dependences. V 
Speculative Lock Elision (SLE), a novel micro-architectura ... 

Trace scheduling is an optimization technique that selects a sequence of
trace and schedules the operations from the trace together. If an operatio
basic block boundaries, one or more compensation copies may be requirt
code. This article discusses the generation of compensation code in a tra
compiler and presents techniques for limiting the amount of compensatic 
(restricting code motion so that no compensatio ... 

We observe a non-negligible fraction (3 to 16% in our benchmarks) of a
instructions, dynamic instruction instances that generate unused results.
these instructions arise from static instructions that also produce useful r
compiler optimization (specifically instruction scheduling) creates a sign 
these partially dead static instructions. We show that most of the dynam 
arise from a small set of st ... 